Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes
Authors
Abstract
Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
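The generalized value iteration mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the update equations, so everything below is an assumption: a finite set of candidate transition models `T` with prior weights `mu`, a reference policy `rho`, and two hypothetical inverse-temperature parameters, `alpha` for the policy constraint and `beta` for the model constraint. Both KL-constrained optimizations are replaced by their standard closed-form free-energy (log-sum-exp) solutions. It is meant as one plausible reading, not as the authors' algorithm.

```python
import numpy as np


def soft_aggregate(values, weights, temperature, axis):
    """Free-energy certainty equivalent (1/t) * log(sum_i w_i * exp(t * x_i)).

    t -> 0 recovers the weighted mean, t -> +inf the max, t -> -inf the min.
    """
    if abs(temperature) < 1e-12:
        return np.sum(weights * values, axis=axis)
    z = temperature * values
    z_max = np.max(z, axis=axis, keepdims=True)  # shift for numerical stability
    s = np.sum(weights * np.exp(z - z_max), axis=axis)
    return (np.squeeze(z_max, axis=axis) + np.log(s)) / temperature


def free_energy_value_iteration(T, R, mu, rho, alpha, beta,
                                gamma=0.9, tol=1e-8, max_iter=10_000):
    """Sketch of value iteration with soft policy and soft model aggregation.

    T:   (M, S, A, S) candidate transition models (assumed finite set)
    R:   (S, A, S)    rewards
    mu:  (M,)         prior weights over models
    rho: (S, A)       reference policy
    """
    n_models, n_states, n_actions, _ = T.shape
    V = np.zeros(n_states)
    Q = np.zeros((n_states, n_actions))
    for _ in range(max_iter):
        # Expected return of each (state, action) pair under each model.
        Q_models = np.einsum("msat,sat->msa", T, R + gamma * V[None, None, :])
        # Model uncertainty: beta < 0 is pessimistic, beta = 0 is Bayesian.
        Q = soft_aggregate(Q_models, mu[:, None, None], beta, axis=0)
        # Bounded-rational policy: large alpha approaches the greedy max.
        V_new = soft_aggregate(Q, rho, alpha, axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q
```

Under these assumptions the limit cases named in the abstract fall out of the two temperatures: a single model with large `alpha` approximates standard value iteration, `beta = 0` averages over models as in Bayesian MDP planning, and strongly negative `beta` approaches worst-case (robust) planning.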
Similar resources
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Saturated Path-Constrained MDP: Planning under Uncertainty and Deterministic Model-Checking Constraints
In many probabilistic planning scenarios, a system’s behavior needs to not only maximize the expected utility but also obey certain restrictions. This paper presents Saturated Path-Constrained Markov Decision Processes (SPC MDPs), a new MDP type for planning under uncertainty with deterministic model-checking constraints, e.g., “state s must be visited before s′”, “the system must end up in s”, ...
Decentralized Markov Decision Processes for Handling Temporal and Resource constraints in a Multiple Robot System
We consider in this paper a multi-robot planning system where robots realize a common mission with the following characteristics: the mission is an acyclic graph of tasks with dependencies and temporal validity windows. Tasks are distributed among the robots, whose durations and resource consumptions for achieving tasks are uncertain. This class of problems can be solved by using decision-theoretic pl...
Distributed Generation Expansion Planning Considering Load Growth Uncertainty: A Novel Multi-Period Stochastic Model
Distributed generation (DG) technology is known as an efficient solution for distribution system planning (DSP) problems. Load growth uncertainty in the distribution network is a significant source of uncertainty that strongly affects the optimal management of DGs. To handle this problem, a novel model is proposed in this paper based on DG solution, consideri...
Joint Bayesian Stochastic Inversion of Well Logs and Seismic Data for Volumetric Uncertainty Analysis
Herein, an application of a new seismic inversion algorithm in one of Iran’s oilfields is described. Stochastic (geostatistical) seismic inversion, as a complementary method to deterministic inversion, is perceived as a combination of geostatistics and seismic inversion algorithms. This method integrates information from different data sources with different scales, as prior informat...